DISCUSSION: THE DANTZIG SELECTOR: STATISTICAL ESTIMATION WHEN p IS MUCH LARGER THAN n

Author

  • JINCHI LV
Abstract

Professors Candès and Tao are to be congratulated for their innovative and valuable contribution to high-dimensional sparse recovery and model selection. The analysis of vast data sets now commonly arising in scientific investigations poses many statistical challenges not present in smaller scale studies. Many of these data sets exhibit sparsity, where most of the data corresponds to noise and only a small fraction is of interest. The needs of this research have excited much interest in the statistical community. In particular, high-dimensional model selection has attracted much recent attention and has become a central topic in statistics. The main difficulty of such a problem comes from collinearity between the predictor variables; it is clear from the geometric point of view that collinearity increases as the dimensionality grows. A common approach taken in the statistics literature is penalized likelihood, for example, the Lasso (Tibshirani [11]), the adaptive Lasso (Zou [12]), SCAD (Fan and Li [7] and Fan and Peng [9]) and the nonnegative garrote (Breiman [1]). Commonly used algorithms include LARS (Efron, Hastie, Johnstone and Tibshirani [6]), LQA (Fan and Li [7]) and MM (Hunter and Li [10]). In the present paper, Candès and Tao take a new approach, called the Dantzig selector, which uses ℓ1-minimization with regularization on the residuals. One promising fact is that the Dantzig selector solves a linear program, usually faster than the existing methods. In addition, the authors establish that, under the Uniform Uncertainty Principle (UUP), with large probability the Dantzig selector mimics the risk of the oracle estimator up to a logarithmic factor log p, where p denotes the number of variables. We appreciate the opportunity to comment on several aspects of this article. Our discussion here will focus on four issues: (1) connection to sparse signal recovery in the noiseless case; (2) the UUP condition and identifiability of the model; (3) computation and model selection; (4) minimax rate.
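The linear-program formulation mentioned above can be made concrete. Below is a minimal sketch of the Dantzig selector, min ||β||_1 subject to ||X^T(y - Xβ)||_∞ ≤ δ, recast as a linear program via the split β = u - v with u, v ≥ 0 and solved with scipy.optimize.linprog in Python. The function name dantzig_selector, the variable names and the toy data are illustrative assumptions, not code from the paper; the threshold δ = σ√(2 log p) follows the λ_p σ scaling used by Candès and Tao for unit-norm columns.

    import numpy as np
    from scipy.optimize import linprog

    def dantzig_selector(X, y, delta):
        """Dantzig selector as an LP: min ||b||_1 s.t. ||X^T (y - X b)||_inf <= delta."""
        n, p = X.shape
        G = X.T @ X                          # p x p Gram matrix
        Xty = X.T @ y
        # Split b = u - v with u, v >= 0, so sum(u) + sum(v) equals ||b||_1 at the optimum.
        c = np.ones(2 * p)
        # Since X^T(y - X b) = Xty - G b, the two-sided constraint becomes:
        #   G b <= Xty + delta   and   -G b <= delta - Xty.
        A_ub = np.vstack([np.hstack([G, -G]),
                          np.hstack([-G, G])])
        b_ub = np.concatenate([Xty + delta, delta - Xty])
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")  # bounds default to x >= 0
        u, v = res.x[:p], res.x[p:]
        return u - v

    # Toy example (hypothetical data, not from the paper): sparse beta, n < p.
    rng = np.random.default_rng(0)
    n, p = 50, 200
    X = rng.standard_normal((n, p))
    X /= np.linalg.norm(X, axis=0)           # unit-norm columns, as the theory assumes
    beta = np.zeros(p)
    beta[:3] = [3.0, -2.0, 1.5]
    sigma = 0.1
    y = X @ beta + sigma * rng.standard_normal(n)
    delta = sigma * np.sqrt(2 * np.log(p))   # the lambda_p * sigma scaling
    beta_hat = dantzig_selector(X, y, delta)

Because both the objective and the constraints are linear in (u, v), any off-the-shelf LP solver applies, which is the computational advantage noted above over quadratic-program formulations such as the Lasso.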


Similar resources

DISCUSSION: THE DANTZIG SELECTOR: STATISTICAL ESTIMATION WHEN p IS MUCH LARGER THAN n

given just a single parameter t. Two active-set methods were described in [11], with some concern about efficiency if p were large, where X is n × p. Later, when basis pursuit de-noising (BPDN) was introduced [2], the intention was to deal with p very large and to allow X to be a sparse matrix or a fast operator. A primal–dual interior method was used to solve the associated quadratic program, ...


DISCUSSION: THE DANTZIG SELECTOR: STATISTICAL ESTIMATION WHEN p IS MUCH LARGER THAN n

1. Introduction. This is a fascinating paper on an important topic: the choice of predictor variables in large-scale linear models. A previous paper in these pages attacked the same problem using the "LARS" algorithm (Efron, Hastie, Johnstone and Tibshirani [3]); actually three algorithms, including the Lasso as middle case. There are tantalizing similarities between the Dantzig Selector (DS) ...


The Dantzig selector: statistical estimation when p is much larger than n

In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R^p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n ≪ p, and the zi's are i.i.d. N(0, σ²). Is it possible to estimate x reliably based on the noisy data y? T...


THE DANTZIG SELECTOR: STATISTICAL ESTIMATION WHEN p IS MUCH LARGER THAN n

s ≪ n/log p, where s is the dimension of the sparsest model. These are, respectively, the conditions of this paper using the Dantzig selector and those of Bunea, Tsybakov and Wegkamp [2] and Meinshausen and Yu [9] using the Lasso. Strictly speaking, Bunea, Tsybakov and Wegkamp consider only prediction, not ℓ2 loss, but in a paper in preparation with Ritov and Tsybakov we show that the spirit of t...


THE DANTZIG SELECTOR: STATISTICAL ESTIMATION WHEN p IS MUCH LARGER THAN n

In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Xβ + z, where β ∈ R^p is a parameter vector of interest, X is a data matrix with possibly far fewer rows than columns, n ≪ p, and the zi's are i.i.d. N(0, σ²). Is it possible to estimate β reliably based on the noisy data y? ...




Publication date: 2007